Add safe_numel()#19074
Conversation
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19074
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ⏳ 1 Pending, 2 Unrelated FailuresAs of commit ae06ce8 with merge base eef7921 ( BROKEN TRUNK - The following jobs failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
JacobSzwejbka
left a comment
There was a problem hiding this comment.
Please swap to just using ET_CHECK(safe_numel()) like we talked about)
…w()" Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
…w()" Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Pull Request resolved: #19074 Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. ghstack-source-id: 372628309 @exported-using-ghexport Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/)
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
Introduces two overflow-checked numel helpers as experimental API additions. - safe_numel(sizes, dim): returns Result; use this where possible - compute_numel_overflow(sizes, dim): aborts on overflow; use this in ctors and other places that cannot return result. Long-term change is to migrate fully to safe_numel. Currently, we have compute_numel, which silently overflows. Authored with Claude. Differential Revision: [D102070375](https://our.internmc.facebook.com/intern/diff/D102070375/) [ghstack-poisoned]
There was a problem hiding this comment.
Pull request overview
Adds a new overflow-checked safe_numel() helper to compute tensor element counts without silent integer overflow, and wires it into both portable (executor) and ATen-mode type shims so callers can begin migrating to checked numel computations.
Changes:
- Introduce
safe_numel()returningResult<ssize_t>for portable tensors (TensorImpl::SizesType). - Add an equivalent
safe_numel()inline helper inexec_aten.hfor ATen mode. - Update exec_aten Buck target deps to export core runtime headers needed by the new API surface.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| runtime/core/portable_type/tensor_impl.h | Declares safe_numel() in the portable tensor API surface and re-exports it via torch::executor aliases. |
| runtime/core/portable_type/tensor_impl.cpp | Implements overflow-checked safe_numel() for portable tensors using c10::mul_overflows. |
| runtime/core/exec_aten/targets.bzl | Adds exported deps so exec_aten.h can include error.h / result.h and related core headers. |
| runtime/core/exec_aten/exec_aten.h | Adds an ATen-mode inline safe_numel() returning Result<ssize_t>. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ssize_t next_numel; | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| !c10::mul_overflows(numel, static_cast<ssize_t>(sizes[i]), &next_numel), | ||
| InvalidArgument, | ||
| "Overflow computing numel at dimension %zd", | ||
| i); | ||
| numel = next_numel; |
There was a problem hiding this comment.
safe_numel() uses c10::mul_overflows() with signed ssize_t. In our portable_type c10 implementation, the MSVC fallback path computes volatile T tmp = a * b for signed T, which can itself trigger signed-overflow UB when the product doesn’t fit. This undermines the overflow checking and can be exploitable for attacker-controlled shapes on MSVC builds. Prefer an overflow check that avoids signed multiplication (e.g., since sizes are validated non-negative, use a size_t/uint64_t accumulator with if (a > SSIZE_MAX / b) bounds checking, or use an unsigned safe-multiply helper and then range-check back to ssize_t).
| ET_CHECK_OR_RETURN_ERROR( | ||
| dim == 0 || sizes != nullptr, | ||
| InvalidArgument, | ||
| "Sizes must be provided for non-scalar tensors"); | ||
| ssize_t numel = 1; | ||
| for (const auto i : c10::irange(dim)) { |
There was a problem hiding this comment.
safe_numel() doesn’t reject negative dim. With c10::irange(dim), a negative dim produces an empty range, so the function returns 1 instead of an error. Since this helper is intended for validating potentially malformed/untrusted shape metadata, it should return InvalidArgument when dim < 0 (before the loop).
| ::executorch::runtime::Result<ssize_t> safe_numel( | ||
| const TensorImpl::SizesType* sizes, | ||
| ssize_t dim) { | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| dim == 0 || sizes != nullptr, | ||
| InvalidArgument, | ||
| "Sizes must be provided for non-scalar tensors"); | ||
| ssize_t numel = 1; | ||
| for (const auto i : c10::irange(dim)) { | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| sizes[i] >= 0, | ||
| InvalidArgument, | ||
| "Size must be non-negative, got %zd at dimension %zd", | ||
| static_cast<ssize_t>(sizes[i]), | ||
| i); | ||
| ssize_t next_numel; | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| !c10::mul_overflows(numel, static_cast<ssize_t>(sizes[i]), &next_numel), | ||
| InvalidArgument, | ||
| "Overflow computing numel at dimension %zd", | ||
| i); | ||
| numel = next_numel; | ||
| } | ||
| return numel; | ||
| } |
There was a problem hiding this comment.
There are existing portable_type tensor_impl unit tests, but safe_numel() introduces new behavior (overflow detection, negative sizes, null sizes for dim>0). Please add focused tests covering: scalar (dim==0, sizes==nullptr), negative dim, negative size, and an overflow case near SSIZE_MAX to ensure the new checks behave as expected.
| ET_CHECK_OR_RETURN_ERROR( | ||
| dim == 0 || sizes != nullptr, | ||
| InvalidArgument, | ||
| "Sizes must be provided for non-scalar tensors"); | ||
| ssize_t numel = 1; | ||
| for (ssize_t i = 0; i < dim; i++) { |
There was a problem hiding this comment.
exec_aten::safe_numel() doesn’t reject negative dim. For malformed metadata, a negative dim will skip the loop and return 1, which is indistinguishable from a scalar. Consider explicitly returning InvalidArgument when dim < 0 (in addition to the existing sizes!=nullptr check for dim>0).
| ET_CHECK_OR_RETURN_ERROR( | ||
| sizes[i] >= 0, | ||
| InvalidArgument, | ||
| "Size must be non-negative, got %zd at dimension %zd", | ||
| static_cast<ssize_t>(sizes[i]), | ||
| i); | ||
| ssize_t next_numel; | ||
| ET_CHECK_OR_RETURN_ERROR( | ||
| !c10::mul_overflows(numel, static_cast<ssize_t>(sizes[i]), &next_numel), | ||
| InvalidArgument, |
There was a problem hiding this comment.
In ATen mode, SizesType is int64_t but safe_numel() casts each size to ssize_t before checking overflow. On platforms where ssize_t is narrower than int64_t (or if a size is larger than SSIZE_MAX), this cast is implementation-defined and can truncate/flip sign, potentially letting invalid shapes slip through or producing incorrect results. Add an explicit range check that sizes[i] <= SSIZE_MAX (and >= 0) before converting, or do the multiplication in a wider unsigned type and validate the final result fits in ssize_t.
Stack from ghstack (oldest at bottom):
Currently, we have compute_numel, which silently overflows.
Authored with Claude.
Differential Revision: D102070375